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Abstract 

The article reviews the statistical theory of signal detection in application to analysis of 
deterministic gravitational-wave signals in the noise of a detector. Statistical foundations for 
the theory of signal detection and parameter estimation are presented. Several tools needed 
for both theoretical evaluation of the optimal data analysis methods and for their practical 
implementation are introduced. They include optimal signal-to-noise ratio, Fisher matrix, 
false alarm and detection probabilities, .7-"-statistic, template placement, and fitting factor. 
These tools apply to the case of signals buried in a stationary and Gaussian noise. Algorithms 
to efficiently implement the optimal data analysis techniques are discussed. Formulas are given 
for a general gravitational-wave signal that includes as special cases most of the deterministic 
signals of interest. 
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update (26 July 2007) 

Section[^and l4.4."T] have been extended, some notations were changed and Figure[l]was updated. 
Equations and (|36p have been corrected. 15 references were added. 

Page 4: Section [2] was shortly extended to describe more general responses of detectors to 
gravitational waves. For this some notations were changed and Figure [T] updated. 

Page 4: Figure [T] has been updated. 

Page 11: Equation (|28p has been corrected (replaced 6{x) by 9 on right hand side). 
Page 13: Equation (f36|) has been corrected (replaced In by log). 

Page 15: Section [4XT] was extended by adding a brief discussion of the results of usage of 
the Fisher matrix for estimating the parameter errors of the gravitational-wave signals. Added 
references to [Ml Hi HI 1301 Si [Ml IHTl HZ] and [H] • 

Page 22: Reference to ^ has been added. 

Page 24: References to [33] and [TS] have been added. 

Page 24: References to [3 [S] and [S3] have been added and briefly described. 



1 Introduction 

In this review we consider the problem of detection of deterministic gravitational-wave signals 
in the noise of a detector and the question of estimation of their parameters. The examples 
of deterministic signals are gravitational waves from rotating neutron stars, coalescing compact 
binaries, and supernova explosions. The case of detection of stochastic gravitational-wave signals 
in the noise of a detector is reviewed in [5] . A very powerful method to detect a signal in noise that 
is optimal by several criteria consists of correlating the data with the template that is matched to 
the expected signal. This matched-filtering technique is a special case of the maximum likelihood 
detection method. In this review we describe the theoretical foundation of the method and we 
show how it can be applied to the case of a very general deterministic gravitational-wave signal 
buried in a stationary and Gaussian noise. 

Early gravitational-wave data analysis was concerned with the detection of bursts originating 
from supernova explosions [S^. It involved analysis of the coincidences among the detectors [52) . 
With the growing interest in laser interferometric gravitational-wave detectors that are broadband 
it was realized that sources other than supernovae can also be detectable j92j and that they 
can provide a wealth of astrophysical information [55] [SS]. For example the analytic form of 
the gravitational-wave signal from a binary system is known in terms of a few parameters to a 
good approximation. Consequently one can detect such a signal by correlating the data with 
the waveform of the signal and maximizing the correlation with respect to the parameters of the 
waveform. Using this method one can pick up a weak signal from the noise by building a large 
signal-to-noise ratio over a wide bandwidth of the detector [22] ■ This observation has led to a 
rapid development of the theory of gravitational-wave data analysis. It became clear that the 
detectability of sources is determined by optimal signal-to- noise ratio, Equation (j24p . which is 
the power spectrum of the signal divided by the power spectrum of the noise integrated over the 
bandwidth of the detector. 

An important landmark was a workshop entitled Gravitational Wave Data Analysis held in 
Dyffryn House and Gardens, St. Nicholas near Cardiff, in July 1987 [86^- The meeting acquainted 
physicists interested in analyzing gravitational-wave data with the basics of the statistical theory 
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of signal detection and its application to detection of gravitational-wave sources. As a result of 
subsequent studies the Fisher information matrix was introduced to the theory of the analysis of 
gravitational- wave data [JHl HE] ■ The diagonal elements of the Fisher matrix give lower bounds 
on the variances of the estimators of the parameters of the signal and can be used to assess the 
quality of astrophysical information that can be obtained from detections of gravitational-wave 
signals [321 EH HI]- It was also realized that application of matched-filtering to some sources, 
notably to continuous sources originating from neutron stars, will require extraordinary large 
computing resources. This gave a further stimulus to the development of optimal and efficient 
algorithms and data analysis methods 'WT . 

A very important development was the work by Cutler et al. [31j where it was realized that for 
the case of coalescing binaries matched filtering was sensitive to very small post-Newtonian effects 
of the waveform. Thus these effects can be detected. This leads to a much better verification of 
Einstein's theory of relativity and provides a wealth of astrophysical information that would make a 
laser interferometric gravitational-wave detector a true astronomical observatory complementary to 
those utilizing the electromagnetic spectrum. As further developments of the theory methods were 
introduced to calculate the quality of suboptimal filters [9] , to calculate the number of templates to 
do a search using matched-filtering [74] , to determine the accuracy of templates required [24] , and 
to calculate the false alarm probability and thresholds '50' . An important point is the reduction 
of the number of parameters that one needs to search for in order to detect a signal. Namely 
estimators of a certain type of parameters, called extrinsic parameters, can be found in a closed 
analytic form and consequently eliminated from the search. Thus a computationally intensive 
search needs only be performed over a reduced set of intrinsic parameters \58\ \50\ [60. . 

Techniques reviewed in this paper have been used in the data analysis of prototypes of gravitational- 
wave detectors [731 IHl H] and in the data analysis of presently working gravitational- wave detec- 
tors [90l[15l[3[2l[I]. 

We use units such that the velocity of light c = 1. 
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2 Response of a Detector to a Gravitational Wave 

There are two main methods to detect gravitational waves which have been implemented in the 
currently working instruments. One method is to measure changes induced by gravitational waves 
on the distances between freely moving test masses using coherent trains of electromagnetic waves. 
The other method is to measure the deformation of large masses at their resonance frequencies 
induced by gravitational waves. The first idea is realized in laser interferometric detectors and 
Doppler tracking experiments [82 [ I65j whereas the second idea is implemented in resonant mass 
detectors [TB] . 

Let us consider the response to a plane gravitational wave of a freely falling configuration of 
masses. It is enough to consider a configuration of three masses shown in Figure [T] to obtain the 
response for all currently working and planned detectors. Two masses model a Doppler tracking 
experiment where one mass is the Earth and the other one is a distant spacecraft. Three masses 
model a ground-based laser interferometer where the masses are suspended from seismically isolated 
supports or a space-borne interferometer where the three masses are shielded in satellites driven 
by drag-free control systems. 



Figure 1: Schematic configuration of three freely falling masses as a detector of gravitational waves. 
The masses are labelled 1, 2, and 3, their positions with respect to the origin O of the coordinate 
system are given by vectors Xa (a = 1,2,3). The Euclidean separations between the masses are 
denoted by La, where the index a corresponds to the opposite mass. The unit vectors lia point 
between pairs of masses, with the orientation indicated. 

In Figuredlwe have introduced the following notation: O denotes the origin of the TT coordinate 
system related to the passing gravitational wave, x^ (a — 1, 2, 3) are 3-vectors joining O and the 
masses, Ua and La (a = 1, 2, 3) are, respectively, 3-vectors of unit Euclidean length along the lines 
joining the masses and the coordinate Euclidean distances between the masses, where a is the label 
of the opposite mass. Let us also denote by k the unit 3-vector directed from the origin O to the 
source of the gravitational wave. We first assume that the spatial coordinates of the masses do not 
change in time. 

Let vq be the frequency of the coherent beam used in the detector (laser light in the case of 
an interferometer and radio waves in the case of Doppler tracking). Let y2i be the relative change 
Av/vq of frequency induced by a transverse, traceless, plane gravitational wave on the coherent 
beam travelling from the mass 2 to the mass 1, and let y^i be the relative change of frequency 
induced on the beam travelling from the mass 3 to the mass 1. The frequency shifts j/21 and y^i 
are given by [37l |TOl [83] 
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2/31 W = (1 + k • ns) [^2{t + k • X3 - L2) - *2(i + k • xi) j , (2) 
where (here T denotes matrix transposition) 

_ nj ■ Hft) -n, „ _ 1 2 3 f3) 
2(l-(k.n,)2)' 

In Equation ([3]) H is the three-dimensional matrix of the spatial metric perturbation produced by 
the wave in the TT coordinate system. If one chooses spatial TT coordinates such that the wave 
is travelling in the +z direction, then the matrix H is given by 

'h+{t) hy,{t) 0\ 

m=\hAt)-h+{t)Q\, (4) 
0/ 

where /i+ and are the two polarizations of the wave. 

Real gravitational-wave detectors do not stay at rest with respect to the TT coordinate system 
related to the passing gravitational wave, because they also move in the gravitational field of the 
solar system bodies, as in the case of the LISA spacecraft, or are fixed to the surface of Earth, as 
in the case of Earth-based laser interferometers or resonant bar detectors. Let us choose the origin 
O of the TT coordinate system to coincide with the solar system bary center (SSB). The motion of 
the detector with respect to the SSB will modulate the gravitational- wave signal registered by the 
detector. One can show that as far as the velocities of the masses (modelling the detector's parts) 
with respect to the SSB are nonrelativistic, which is the case for all existing or planned detectors, 
the Equations ([T]) and ^ can still be used, provided the 3-vectors x^ and ria (a = 1,2,3) will be 
interpreted as made of the time-dependent components describing the motion of the masses with 
respect to the SSB. 

It is often convenient to introduce the proper reference frame of the detector with coordinates 
{x°'). Because the motion of this frame with respect to the SSB is nonrelativistic, we can assume 
that the transformation between the SSB-related coordinates (x") and the detector's coordinates 
(x") has the form 

i^t, x' = x'q (t) + O'j (t) x\ (5) 

where the functions a;g(i) describe the motion of the origin O of the proper reference frame with 

respect to the SSB, and the functions 0'^j{t) account for the different orientations of the spatial 
axes of the two reference frames. One can compute some of the quantities entering Equations ^ 
and (21) in the detector's coordinates rather than in the TT coordinates. For instance, the matrix 
H of the wave-induced spatial metric perturbation in the detector's coordinates is related to the 
matrix H of the spatial metric perturbation produced by the wave in the TT coordinate system 
through equation 

m = {0{t)-'V-H{t)-0{t)-\ (6) 

where the matrix has elements 0*j. If the transformation matrix is orthogonal, then 0^^ = 0^, 
and Equation ([6]) simplifies to 

m = 0{t)-H{t)-0{t)\ (7) 

See [231142] [501 [60] for more details. 

For a standard Michelson, equal-arm interferometric configuration Ai^ is given in terms of one- 
way frequency changes j/21 and 2/31 (see Equations ([T]) and ([2]) with L2 = £3 = L, where we assume 
that the mass 1 corresponds to the corner station of the interferometer) by the expression |93j 

— = iysiit) + Visit - L)) - (2/2i(i) + Vi2{t - L)) . (8) 
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In the long-wavelength approximation Equation ([8]) reduces to 

Av f dHm ^ dH(i) \ 

The difference of the phase fluctuations A(/)(t) measured, say, by a photo detector, is related to the 
corresponding relative frequency fluctuations Au by 

Av ^ 1 dA0(t) ^ ^^^^ 
V{) 2ttvq dt 

By virtue of Equation ([9]) the phase change can be written as 

A(t){t)=ATTVoLh{t), (11) 

where the function 

h{t) := 1 (nj • H(t) • n3 - . H(t) • na) , (12) 

is the response of the interferometer to a gravitational wave in the long- wavelength approximation. 
In this approximation the response of a laser interferometer is usually derived from the equation 
of geodesic deviation (then the response is defined as the difference between the relative wave- 
induced changes of the proper lengths of the two arms, i.e., h{t) AL{t)/L). There are important 
cases where the long-wavelength approximation is not valid. These include the space-borne LISA 
detector for gravitational-wave frequencies larger than a few mHz and satellite Doppler tracking 
measurements. 

In the case of a bar detector the long-wavelength approximation is very accurate and the 
detector's response is defined as hB{t) := AL{t)/L, where AL is the wave-induced change of the 
proper length L of the bar. The response /ib is given by 

/iB(t) =n'^ • H(i) -n, (13) 

where n is the unit vector along the symmetry axis of the bar. 

In most cases of interest the response of the detector to a gravitational wave can be written as 
a linear combination of four constant amplitudes a''^' , 

4 

h{t;a(''\e) = ^a('=)/i('=Ht;^^) = • h(t;^''), (14) 
fe=i 

where the four functions /i'-'^) depend on a set of parameters but are independent of the pa- 
rameters a^''\ The parameters a^*^-* are called extrinsic parameters whereas the parameters are 
called intrinsic. In the long-wavelength approximation the functions ft.^'^-' are given by 

Mi)(t;e) = u(i;e^)cos0(t;e'^), 

M2)(t;e) = ^;(t;?'^)cos0(t;e'^), 

. s (15) 
/i(3)(t;e) = u(i;^M)sin0(f;C^), 

h(^Ht;e) = v{t;e)sin<P{t;^n: 

where (^{t; is the phase modulation of the signal and u{t; ^^), v{t; ^^) are slowly varying ampli- 
tude modulations. 

Equation is a model of the response of the space-based detector LISA to gravitational 
waves from a binary system [60] , whereas Equation (jlSp is a model of the response of a ground- 
based detector to a continuous source of gravitational waves like a rotating neutron star ^0] . The 
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gravitational-wave signal from spinning neutron stars may consist of several components of the 
form (|14p . For short observation times over which the amplitude modulation functions are nearly 
constant, the response can be approximated by 

h{t- Ao, 00, = ^0 9{t\ e') cos e) - <Po) , (16) 

where Aq and 0o are constant amplitude and initial phase, respectively, and ) is a slowly 

varying function of time. Equation (flBj) is a good model for a response of a detector to the 
gravitational wave from a coalescing binary system [92l |22] ■ We would like to stress that not all 
deterministic gravitational- wave signals may be cast into the general form p4p . 
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3 Statistical Theory of Signal Detection 



The gravitational-wave signal will be buried in the noise of the detector and the data from the 
detector will be a random process. Consequently the problem of extracting the signal from the 
noise is a statistical one. The basic idea behind the signal detection is that the presence of the 
signal changes the statistical characteristics of the data x, in particular its probability distribution. 
When the signal is absent the data have probability density function (pdf) pq{x)^ and when the 
signal is present the pdf is pi{x). 

A full exposition of the statistical theory of signal detection that is outlined here can be found 
in the monographs ^102 , 56, 98, [SHI SH [77] . A general introduction to stochastic processes is 
given in [100] . Advanced treatment of the subject can be found in [64l 1101] . 

The problem of detecting the signal in noise can be posed as a statistical hypothesis testing 
problem. The null hypothesis Hq is that the signal is absent from the data and the alternative 
hypothesis Hi is that the signal is present. A hypothesis test (or decision rule) 6 is a partition of 
the observation set into two sets, TZ and its complement TZ' . If data are in TZ we accept the null 
hypothesis, otherwise we reject it. There are two kinds of errors that we can make. A type I error is 
choosing hypothesis Hi when Hq is true and a type II error is choosing Hq when Hi is true. In signal 
detection theory the probability of a type I error is called the false alarm probability, whereas the 
probability of a type II error is called the false dismissal probability. 1 — (false dismissal probability) 
is the probability of detection of the signal. In hypothesis testing the probability of a type I error 
is called the significance of the test, whereas 1 — (probability of type II error) is called the power 
of the test. 

The problem is to find a test that is in some way optimal. There are several approaches to 
find such a test. The subject is covered in detail in many books on statistics, for example see 
references [5illiT1 [5^. 

3.1 Bayesian approach 

In the Bayesian approach we assign costs to our decisions; in particular we introduce positive 
numbers Cij, i,j = 0, 1, where Cy is the cost incurred by choosing hypothesis Hi when hypothesis 
Hj is true. We define the conditional risk i? of a decision rule S for each hypothesis as 

Rj (S) = CojPj (7^) + CijP, in'), J - 0, 1, (i7) 

where Pj is the probability distribution of the data when hypothesis Hj is true. Next we assign 
probabilities ttq and tti = 1 — ttq to the occurrences of hypothesis Hq and Hi , respectively. These 
probabilities are called a priori probabilities or priors. We define the Bayes risk as the overall 
average cost incurred by the decision rule 6: 

r{d) =7:oRoiS)+TriRiid). (18) 

Finally we define the Bayes rule as the rule that minimizes the Bayes risk r{S). 

3.2 Minimax approach 

Very often in practice we do not have the control over or access to the mechanism generating the 
state of nature and we are not able to assign priors to various hypotheses. In such a case one 
criterion is to seek a decision rule that minimizes, over all S, the maximum of the conditional risks, 
Ro{S) and Ri{6). A decision rule that fulfills that criterion is called minimax rule. 
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3.3 Ney man— Pearson approach 

In many problems of practical interest the imposition of a specific cost structure on the decisions 
made is not possible or desirable. The Neyman-Pearson approach involves a trade-off between the 
two types of errors that one can make in choosing a particular hypothesis. The Neyman-Pearson 
design criterion is to maximize the power of the test (probability of detection) subject to a chosen 
significance of the test (false alarm probability). 

3.4 Likelihood ratio test 

It is remarkable that all three very different approaches - Bayesian, minimax, and Neyman-Pearson 
- lead to the same test called the likelihood ratio test [34]. The likelihood ratio A is the ratio of 
the pdf when the signal is present to the pdf when it is absent: 

A(.) (19) 
Po(x) 

We accept the hypothesis Hi ii A > k, where k is the threshold that is calculated from the costs 
Cij, priors tt^, or the significance of the test depending on what approach is being used. 

3.4.1 Gaussian case The matched filter 

Let h be the gravitational-wave signal and let n be the detector noise. For convenience we assume 
that the signal ft, is a continuous function of time t and that the noise n is a continuous random 
process. Results for the discrete time data that we have in practice can then be obtained by a 
suitable sampling of the continuous-in-time expressions. Assuming that the noise is additive the 
data X can be written as 

x{t)=n{t) + h{t). (20) 

In addition, if the noise is a zero-mean^ stationary^ and Gaussian random process, the log likelihood 
function is given by 

logA=(x|/i)-i(ft|ft), (21) 
where the scalar product ( • | • ) is defined by 

ix\y):=A^ r^^iKilldf. (22) 
Jo S[f) 

In Equation Jft denotes the real part of a complex expression, the tilde denotes the Fourier 
transform, the asterisk is complex conjugation, and S is the one-sided spectral density of the noise 
in the detector, which is defined through equation 

E[n(/)r(/')]-i5(/-m/), (23) 

where E denotes the expectation value. 

From the expression ([2T|l we see immediately that the likelihood ratio test consists of correlating 
the data x with the signal h that is present in the noise and comparing the correlation to a threshold. 
Such a correlation is called the matched filter. The matched filter is a linear operation on the data. 

An important quantity is the optimal signal-to-noise ratio p defined by 

p^:^ih\h)^ml^^Hljldf. (24) 
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We see in the following that p determines the probability of detection of the signal. The higher 
the signal-to-noise ratio the higher the probability of detection. 

An interesting property of the matched filter is that it maximizes the signal-to-noise ratio over 
all linear filters [34] . This property is independent of the probability distribution of the noise. 
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4 Parameter Estimation 



Very often we know the waveform of the signal that we are searching for in the data in terms of 
a finite number of unknown parameters. We would like to find optimal procedures of estimating 
these parameters. An estimator of a parameter is a function 9{x) that assigns to each data the 
"best" guess of the true value of 6. Note that because 9{x) depends on the random data it is a 
random variable. Ideally we would like our estimator to be (i) unbiased, i.e., its expectation value 
to be equal to the true value of the parameter, and (ii) of minimum variance. Such estimators are 
rare and in general difhcult to find. As in the signal detection there are several approaches to the 
parameter estimation problem. The subject is exposed in detail in reference [63j . See also [103\ 
for a concise account. 



4.1 Bayesian estimation 

We assign a cost function C{9', 9) of estimating the true value of 9 as 6' . We then associate with 
an estimator 9 a conditional risk or cost averaged over all realizations of data x for each value of 
the parameter 9: 

Re{9)=Eg[C{0,9)]^ J C (§{x),9^ p{x,9) dx, (25) 

where X is the set of observations and p{x, 9) is the joint probability distribution of data x and 
parameter 9. We further assume that there is a certain a priori probability distribution 7r(0) of 
the parameter 9. We then define the Bayes estimator as the estimator that minimizes the average 
risk defined as 

r{§) = E[Rg{9)] = J jy(9{x),9^p{x,9)TT{9)d9dx, (26) 

where E is the expectation value with respect to an a priori distribution tt, and Q is the set of 
observations of the parameter 9. It is not difficult to show that for a commonly used cost function 

C{9',9) = {9'-ef, (27) 

the Bayesian estimator is the conditional mean of the parameter 9 given data x, i.e., 

9{x)^^[9\x\^ I 9p{9\x)d9, (28) 



J0 

where p{9\x) is the conditional probability density of parameter 9 given the data x. 
4.2 Maximum a posteriori probability estimation 

Suppose that in a given estimation problem we are not able to assign a particular cost function 
C{9',9). Then a natural choice is a uniform cost function equal to over a certain interval Ig of 
the parameter 9. From Bayes theorem j20j we have 

where p{x) is the probability distribution of data x. Then from Equation (j26p one can deduce that 
for each data x the Bayes estimate is any value of 9 that maximizes the conditional probability 
p{9\x). The density p{9\x) is also called the a posteriori probability density of parameter 9 and 
the estimator that maximizes p{9\x) is called the maximum a posteriori (MAP) estimator. It is 
denoted by 6'map- We find that the MAP estimators are solutions of the following equation 

d\ogp{x,9) _ d\ogTT{9) 

d9 ~ 89 ' ^^"^ 
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which is called the MAP equation. 



4.3 Maximum likelihood estimation 

Often we do not know the a priori probability density of a given parameter and we simply assign 
to it a uniform probability. In such a case maximization of the a posteriori probability is equiv- 
alent to maximization of the probability density p{x, 9) treated as a function of 9. We call the 
function x) := p{x, 9) the likelihood function and the value of the parameter 9 that maximizes 
l{9,x) the maximum likelihood (ML) estimator. Instead of the function I we can use the function 
A{9,x) = l{9,x)/p{x) (assuming that p{x) > 0). A is then equivalent to the likelihood ratio [see 
Equation when the parameters of the signal are known. Then the ML estimators are obtained 
by solving the equation 

(31) 

which is called the ML equation. 



4.3.1 Gaussian case 

For the general gravitational-wave signal defined in Equation (fH)) the log likelihood function is 
given by 

logA = a^-N-ia'^-M-a, (32) 
where the components of the column vector N and the matrix M are given by 

iV^*^) := ix\h^'''^), mC^')'" := (/le^'^l/jf')), (33) 

with x{t) — n{t) + h{t), and where n{t) is a zero-mean Gaussian random process. The ML equations 
for the extrinsic parameters a can be solved explicitly and their ML estimators a are given by 

a=M-i-N. (34) 

Substituting a into log A we obtain a function 

jF= In^ . -N, (35) 

that we call the JF-statistic. The JF-statistic depends (nonlinearly) only on the intrinsic parameters 

Thus the procedure to detect the signal and estimate its parameters consists of two parts. The 
first part is to find the (local) maxima of the J^-statistic in the intrinsic parameter space. The ML 
estimators of the intrinsic parameters are those for which the JF-statistic attains a maximum. The 
second part is to calculate the estimators of the extrinsic parameters from the analytic formula (|34p , 
where the matrix M and the correlations N are calculated for the intrinsic parameters equal to their 
ML estimators obtained from the first part of the analysis. We call this procedure the maximum 
likelihood detection. See Section 14.81 for a discussion of the algorithms to find the (local) maxima 
of the J^-statistic. 



4.4 Fisher information 

It is important to know how good our estimators are. We would like our estimator to have as small 
variance as possible. There is a useful lower bound on variances of the parameter estimators called 
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Cramer-Rao bound. Let us first introduce the Fisher information matrix T witli tlic components 
defined by 

"91ogA91ogA] _^[9^1ogA" 



Tij E 



= -E 



(36) 



The Cramer-Rao bound states that for unbiased estimators the covariance matrix of the estimators 
C > r^^. (The inequality A > B for matrices means that the matrix A—B is nonnegative definite.) 

A very important property of the ML estimators is that asymptotically (i.e., for a signal-to- 
noise ratio tending to infinity) they are (i) unbiased, and (ii) they have a Gaussian distribution 
with covariance matrix equal to the inverse of the Fisher information matrix. 



4.4.1 Gaussian case 

In the case of Gaussian noise the components of the Fisher matrix are given by 



dh 



(37) 



For the case of the general gravitational-wave signal defined in Equation the set of the signal 
parameters splits naturally into extrinsic and intrinsic parameters: 9 = (a^*^-*,^^). Then the 
Fisher matrix can be written in terms of block matrices for these two sets of parameters as 

r = ( nT .,T c (38) 



a^ FTa^ Sa^ 

where the top left block corresponds to the extrinsic parameters, the bottom right block corre- 
sponds to the intrinsic parameters, the superscript T denotes here transposition over the extrinsic 
parameter indices, and the dot stands for the matrix multiplication with respect to these parame- 
ters. Matrix M is given by Equation (|33p . and the matrices F and S are defined as follows: 



de J 



(39) 



The covariance matrix C, which approximates the expected covariances of the ML parameter 
estimators, is defined as F^^. Using the standard formula for the inverse of a block matrix [67] we 
have 

/M-i + M-i • (F • a) • F-i • (F- a)T . M'^ -M'^ ■ (F- a) -F-iX 

-f-.(F.a)T.M- f- j' ^''^ 

where 

f := a'^ • (S- F'^ • • F) • a. (41) 

We call F'^" (the Schur complement of M) the projected Fisher matrix (onto the space of intrin- 
sic parameters). Because the projected Fisher matrix is the inverse of the intrinsic-parameter 
submatrix of the covariance matrix C, it expresses the information available about the intrinsic 
parameters that takes into account the correlations with the extrinsic parameters. Note that f '^'^ 
is still a function of the putative extrinsic parameters. 
We next define the normalized projected Fisher matrix 

f - r _ aT-(S-F^.M-^.F).a 
a^ • M • a 



where p = V ■ M ■ a is the signal-to-noise ratio. From the Rayleigh principle [57] follows that 
the minimum value of the component F^" is given by the smallest eigenvalue (taken with respect 
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to the extrinsic parameters) of the matrix ((S — F"'' • • F) • M^^)'"'. Similarly, the maximum 
value of the component f is given by the largest eigenvalue of that matrix. Because the trace of 
a matrix is equal to the sum of its eigenvalues, the matrix 

f itr [(S- F^ • • F) • M-i] , (43) 

where the trace is taken over the extrinsic-parameter indices, expresses the information available 
about the intrinsic parameters, averaged over the possible values of the extrinsic parameters. Note 
that the factor 1/4 is specific to the case of four extrinsic parameters. We call T'^'^ the reduced 
Fisher matrix. This matrix is a function of the intrinsic parameters alone. We see that the reduced 
Fisher matrix plays a key role in the signal processing theory that we review here. It is used in the 
calculation of the threshold for statistically significant detection and in the formula for the number 
of templates needed to do a given search. 
For the case of the signal 

h{t; Ao,cl>o,e) = ^0 9{t; e) cos {cf>{t; - M , (44) 

the normalized projected Fisher matrix r„ is independent of the extrinsic parameters Aq and 00, 
and it is equal to the reduced matrix F [74 . The components of F are given by 

^ n 1 



p,,. uu (45) 

r n 





where Fq is the Fisher matrix for the signal g{t; ^^) cos {(f>{t; ^^) — (po). 

Fisher matrix has been extensively used to assess the accuracy of estimation of astrophysically 
interesting parameters of gravitational-wave signals. First calculations of Fisher matrix concerned 
gravitational- wave signals from inspiralling binaries in quadrupole approximation [40. ,58 and from 
quasi- normal modes of Kerr black hole [3S]. Cutler and Flanagan [32] initiated the study of 
the implications of higher FN order phasing formula as applied to the parameter estimation of 
inspiralling binaries. They used the 1.5PN phasing formula to investigate the problem of parameter 
estimation, both for spinning and non-spinning binaries, and examined the effect of the spin-orbit 
coupling on the estimation of parameters. The effect of the 2PN phasing formula was analyzed 
independently by Poisson and Will [75] and Krolak, Kokkotas and Schafer ^57^ . In both of these 
works the focus was to understand the new spin-spin coupling term appearing at the 2PN order 
when the spins were aligned perpendicular to the orbital plane. Compared to [57], [76] also included 
a priori information about the magnitude of the spin parameters, which then leads to a reduction 
in the rms errors in the estimation of mass parameters. The case of 3.5PN phasing formula was 
studied in detail by Arun et al. |tl2j. Inclusion of 3.5PN effects leads to an improved estimate of 
the binary parameters. Improvements are relatively smaller for lighter binaries. 

Various authors have investigated the accuracy with which LISA detector can determine binary 
parameters including spin effects. Cutler [^C determined LISA's angular resolution and evaluated 
the errors of the binary masses and distance considering spins aligned or anti-aligned with the 
orbital angular momentum. Hughes [46] investigated the accuracy with which the redshift can be 
estimated (if the cosmological parameters are derived independently), and considered the black- 
hole ring-down phase in addition to the inspiralling signal. Seto [89^ included the effect of finite 
armlength (going beyond the long wavelength approximation) and found that the accuracy of the 
distance determination and angular resolution improve. This happens because the response of the 
instrument when the armlength is finite depends strongly on the location of the source, which 
is tightly correlated with the distance and the direction of the orbital angular momentum. Vec- 
chio [HZ] provided the first estimate of parameters for processing binaries when only one of the two 
supermassive black holes carries spin. He showed that modulational effects decorrelate the binary 
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parameters to some extent, resulting in a better estimation of the parameters compared to the case 
when spins are ahgned or antiahgned with orbital angular momentum. Hughes and Menou [37] 
studied a class of binaries, which they called "golden binaries," for which the inspiral and ring-down 
phases could be observed with good enough precision to carry out valuable tests of strong-field 
gravity. Berti, Buonanno and Will [21] have shown that inclusion of non-precessing spin-orbit and 
spin-spin terms in the gravitational-wave phasing generally reduces the accuracy with which the 
parameters of the binary can be estimated. This is not surprising, since the parameters are highly 
correlated, and adding parameters effectively dilutes the available information. 



4.5 False alarm and detection probabilities — Gaussian case 
4.5.1 Statistical properties of the .^-statistic 

We first present the false alarm and detection pdfs when the intrinsic parameters of the signal are 
known. In this case the statistic is a quadratic form of the random variables that are correlations 
of the data. As we assume that the noise in the data is Gaussian and the correlations are linear 
functions of the data, is a quadratic form of the Gaussian random variables. Consequently J-- 
statistic has a distribution related to the distribution. One can show (see Section III B in [49] ) 
that for the signal given by Equation (fT4]) , 2JF has a distribution with 4 degrees of freedom when 
the signal is absent and noncentral distribution with 4 degrees of freedom and non-centrality 
parameter equal to signal-to-noise ratio {h\h) when the signal is present. 

As a result the pdfs po and pi of when the intrinsic parameters are known and when respec- 
tively the signal is absent and present are given by 

■pn/2-l 

Poi^) = T-llT-^ exp(-.F), (46) 



(n/2-1)! 

P^(P^^) = ^^"^W2-i In/2-1 [pV2T] exp l-J^--p'\, (47) 



( 2 ■f)(n/ 2-1)/ 2 / 1 

^ In/2-1 [pV2T) exp [-T -^^ 

where n is the number of degrees of freedom of x^ distributions and In/2-1 is the modified Bessel 
function of the first kind and order n/2 — 1. The false alarm probability Pf is the probability that 
exceeds a certain threshold J^q when there is no signal. In our case we have 

roo "/2-1 k 

Pf(.Fo):= / po(^)d^ = exp(-.Fo) ^ (48) 

The probability of detection Po is the probability that J- exceeds the threshold J-q when the 
signal-to-noise ratio is equal to p: 

/>oo 

Pb{p,:Fo) / pi{p,T)dT. (49) 

The integral in the above formula can be expressed in terms of the generalized Marcum Q- 
function [MJHJ, (5(a,/3) = Pd(Q!, /3'^/2). We see that when the noise in the detector is Gaussian 
and the intrinsic parameters are known, the probability of detection of the signal depends on a 
single quantity: the optimal signal-to-noise ratio p. 



4.5.2 False alarm probability 

Next we return to the case when the intrinsic parameters ^ are not known. Then the statistic 
given by Equation (l35]) is a certain generalized multiparameter random process called the 
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random field (see Adler's monograph [4] for a comprehensive discussion of random fields). If the 
vector ^ has one component the random field is simply a random process. For random fields we 
can define the autocovariance function C just in the same way as we define such a function for a 
random process: 

C(t I') := Eom)na MHmom% (50) 

where ^ and ^' are two values of the intrinsic parameter set, and Eq is the expectation value when 
the signal is absent. One can show that for the signal (|14p the autocovariance function C is given 
by 

(51) 



where 



Q 



(fc)(0 



C(l,^') = Jtr(QT.M-i.Q.M'-i), 
(/iW(t;0|/^("(i;l')) , M'WW := {hC^Ht-.ah^Ht;^')) 



(52) 



We have C(^,|) = 1. 

One can estimate the false alarm probability in the following way [50]. The autocovariance 
function C tends to zero as the displacement = — ^ increases (it is maximal for = 0). Thus 
we can divide the parameter space into elementary cells such that in each cell the autocovariance 
function C is appreciably different from zero. The realizations of the random field within a cell will 
be correlated (dependent), whereas realizations of the random field within each cell and outside the 
cell are almost uncorrelated (independent). Thus the number of cells covering the parameter space 
gives an estimate of the number of independent realizations of the random field. The correlation 
hypersurface is a closed surface defined by the requirement that at the boundary of the hypersurface 
the correlation C equals half of its maximum value. The elementary cell is defined by the equation 



1 



(53) 



for ^ at cell center and ^' on cell boundary. To estimate the number of cells we perform the Taylor 
expansion of the autocorrelation function up to the second-order terms; 



2 d£,[dC, 



As C attains its maximum value when ^ — ^' = 0, we have 



5C(|,r) 



= 0. 



Let us introduce the symmetric matrix 



Gij :— 



152C(|,0 



2 aeiae 



Then the approximate equation for the elementary cell is given by 

1 

2' 



Gy AC^ A^j 



(54) 



(55) 



(56) 



(57) 



It is interesting to find a relation between the matrix G and the Fisher matrix. One can show 
(see [50], Appendix B) that the matrix G is precisely equal to the reduced Fisher matrix F given 
by Equation 

Let K be the number of the intrinsic parameters. If the components of the matrix G are 
constant (independent of the values of the parameters of the signal) the above equation is an 
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equation for a hyperellipse. The if-dimensional Euclidean volume Vccii of the elementary cell 
defined by Equation (|57)l equals 

r(if/2 + i)Vd^' 

where F denotes the Gamma function. We estimate the number Nc of elementary cells by dividing 
the total Euclidean volume V of the if-dimensional parameter space by the volume Vccii of the 
elementary cell, i.e. we have 

Nc^T^- (59) 

The components of the matrix G are constant for the signal h{t; Aq, = cos — (po) 

when the phase (j)(t; ^'') is a linear function of the intrinsic parameters 

To estimate the number of cells in the case when the components of the matrix G are not 
constant, i.e. when they depend on the values of the parameters, we write Equation (|59p as 



(60) 



This procedure can be thought of as interpreting the matrix G as the metric on the parameter 
space. This interpretation appeared for the first time in the context of gravitational-wave data 
analysis in the work by Owen [71], where an analogous integral formula was proposed for the 
number of templates needed to perform a search for gravitational-wave signals from coalescing 
binaries. 

The concept of number of cells was introduced in [50] and it is a generalization of the idea of 
an effective number of samples introduced in 36] for the case of a coalescing binary signal. 

We approximate the probability distribution of in each cell by the probability po{T) 

when the parameters are known [in our case by probability given by Equation The values 

of the statistic J- in each cell can be considered as independent random variables. The probability 
that does not exceed the threshold !Fo in a given cell is 1 — Pf{J^o): where Pf{J^o) is given by 
Equation Consequently the probability that does not exceed the threshold JFq in 0,11 the 

Nc cells is [1 — Pf(-^o)]^''- The probability Pp that exceeds in one or more cell is thus given 

by 

P^{To) = l-[l-PATo)f'. (61) 

This by definition is the false alarm probability when the phase parameters are unknown. The 
number of false alarms Np is given by 

Nf = iVcPj(^o). (62) 

A different approach to the calculation of the number of false alarms using the Euler characteristic 
of level crossings of a random field is described in [32] ■ 

It was shown (see [53]) that for any finite and A^c, Equation (|CT]) provides an upper bound 
for the false alarm probability. Also in [5S] a tighter upper bound for the false alarm probability 
was derived by modifying a formula obtained by Mohanty [68] . The formula amounts essentially 
to introducing a suitable coefficient multiplying the number of cells Ac. 



4.5.3 Detection probability 

When the signal is present a precise calculation of the pdf of T is very difficult because the presence 
of the signal makes the data random process x{t) non-stationary. As a first approximation we 
can estimate the probability of detection of the signal when the parameters are unknown by the 
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probability of detection when the parameters of the signal are known [given by Equation ()49|) ]. 
This approximation assumes that when the signal is present the true values of the phase parameters 
fall within the cell where J- has a maximum. This approximation will be the better the higher the 
signal-to-noise ratio p is. 



4.6 Number of templates 

To search for gravitational-wave signals we evaluate the J^-statistic on a grid in parameter space. 
The grid has to be sufRciently fine such that the loss of signals is minimized. In order to estimate 
the number of points of the grid, or in other words the number of templates that we need to search 
for a signal, the natural quantity to study is the expectation value of the JF-statistic when the 
signal is present. We have 

E[J^] = i(4 + a'^-Q^-M'-i-Q-a). (63) 

The components of the matrix Q are given in Equation (|52p . Let us rewrite the expectation 
value (|63p in the following form, 

1 ( ^ , 2aT.QT.M'-i-Q-a 



where p is the signal-to-noise ratio. Let us also define the normalized correlation function 

aT.QT.M-i.Q-a 
^" ■= ^TW-^ ■ 

From the Rayleigh principle '67' it follows that the minimum of the normalized correlation function 
is equal to the smallest eigenvalue of the normalized matrix Q"'"-M'^^-Q-M^^, whereas the maximum 
is given by its largest eigenvalue. We define the reduced correlation function as 

C{L I') i tr (QT . • Q • M'-^) . (66) 

As the trace of a matrix equals the sum of its eigenvalues, the reduced correlation function C is 
equal to the average of the eigenvalues of the normalized correlation function Cn- In this sense 
we can think of the reduced correlation function as an "average" of the normalized correlation 
function. The advantage of the reduced correlation function is that it depends only on the intrinsic 
parameters ^, and thus it is suitable for studying the number of grid points on which the J^-statistic 
needs to be evaluated. We also note that the normalized correlation function C precisely coincides 
with the autocovariance function C of the .7^-statistic given by Equation (|5ip . 

Like in the calculation of the number of cells in order to estimate the number of templates we 
perform a Taylor expansion of C up to second order terms around the true values of the parameters, 
and we obtain an equation analogous to Equation (j57p . 



G,, AC.Ae, = 1-Co, (67) 

where G is given by Equation (j56p . By arguments identical to those in deriving the formula for 
the number of cells we arrive at the following formula for the number of templates: 



1 r(i\:/2 + i) /■ r- — 



(68) 



When Co = 1/2 the above formula coincides with the formula for the number of cells. Equa- 
tion ([60]). Here we would like to place the templates sufficiently closely so that the loss of signals 
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is minimized. Thus 1 — Co needs to be chosen sufficiently smah. The formula (|68|) for the number 
of templates assumes that the templates are placed in the centers of hyperspheres and that the 
hyperspheres fill the parameter space without holes. In order to have a tiling of the parameter 
space without holes we can place the templates in the centers of hypercubes which are inscribed 
in the hyperspheres. Then the formula for the number of templates reads 

For the case of the signal given by Equation (|16|) our formula for number of templates is 
equivalent to the original formula derived by Owen j74] . Owen [74] has also introduced a geometric 
approach to the problem of template placement involving the identification of the Fisher matrix 
with a metric on the parameter space. An early study of the template placement for the case of 
coalescing binaries can be found in [84l [35l \19\ . Applications of the geometric approach of Owen 
to the case of spinning neutron stars and supernova bursts are given in |241 lllj . 

The problem of how to cover the parameter space with the smallest possible number of tem- 
plates, such that no point in the parameter space lies further away from a grid point than a certain 
distance, is known in mathematical literature as the covering problem [28]. The maximum distance 
of any point to the next grid point is called the covering radius R. An important class of coverings 
are lattice coverings. We define a lattice in if -dimensional Euclidean space to be the set of 
points including such that if u and v are lattice points, then also u + v and u — v are lattice 
points. The basic building block of a lattice is called the fundamental region. A lattice covering is 
a covering of by spheres of covering radius i?, where the centers of the spheres form a lattice. 
The most important quantity of a covering is its thickness O defined as 

„ volume of one if-dimensional sphere 

9 := -. . (70) 

volume of the fundamental region 

In the case of a two-dimensional Euclidean space the best covering is the hexagonal covering and 
its thickness ~ 1.21. For dimensions higher than 2 the best covering is not known. We know 
however the best lattice covering for dimensions K < 23. These are so-called lattices which 
have a thickness Qa* equal to 

where Vk is the volume of the A'-dimensional sphere of unit radius. 

For the case of spinning neutron stars a 3-dimensional grid was constructed consisting of prisms 
with hexagonal bases |16J. This grid has a thickness around 1.84 which is much better than the 
cubic grid which has thickness of approximately 2.72. It is worse than the best lattice covering 
which has the thickness around 1.46. The advantage of an A*^^ lattice over the hypercubic lattice 
grows exponentially with the number of dimensions. 



4.7 Suboptimal filtering 

To extract signals from the noise one very often uses filters that are not optimal. We may have 
to choose an approximate, suboptimal filter because we do not know the exact form of the signal 
(this is almost always the case in practice) or in order to reduce the computational cost and to 
simplify the analysis. The most natural and simplest way to proceed is to use as our statistic the 
^-statistic where the filters h'i^{t; C) are the approximate ones instead of the optimal ones matched 
to the signal. In general the functions h'f,{t;C) will be different from the functions hk{t;^) used in 
optimal filtering, and also the set of parameters ^ will be different from the set of parameters ^ 
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in optimal filters. We call this procedure the suboptimal filtering and we denote the suboptimal 
statistic by J^s- 

We need a measure of how well a given suboptimal filter performs. To find such a measure we 
calculate the expectation value of the suboptimal statistic. We get 

E[.Fs] = i(4 + aT.Qj.Mr^-Q.-a), (72) 

where 

(/j'W(t;C)|/i'W(i;C)), 



(73) 



where p is the optimal signal-to-noise ratio. The expectation value E[jFs] reaches its maximum 
equal to 2 + p^/2 when the filter is perfectly matched to the signal. A natural measure of the 
performance of a suboptimal filter is the quantity FF defined by 



Q(fe)(0 (/,W(t;|)|/l'W(t;^)). 

Let us rewrite the expectation value E[jFs] in the following form, 

1 A , 2a^-Q?-Mri-Qs-a 




FF max W „ ^ . (75) 



We call the quantity FF the generalized fitting factor. 
In the case of a signal given by 

s{t;Ao,0=Aohit;^), (76) 

the generalized fitting factor defined above reduces to the fitting factor introduced by Aposto- 
latos 0: 

FF ^ iHt;m'ifX)) 

C ViKt;mit;i))V(.h'{t;C)\h'it;C)) 

The fitting factor is the ratio of the maximal signal-to-noise ratio that can be achieved with 
suboptimal filtering to the signal-to-noise ratio obtained when we use a perfectly matched, optimal 
filter. We note that for the signal given by Equation ([7S|) . FF is independent of the value of the 
amplitude Aq. For the general signal with 4 constant amplitudes it follows from the Rayleigh 
principle that the fitting factor is the maximum of the largest eigenvalue of the matrix Q""" • M'^^ • 
Q • over the intrinsic parameters of the signal. 
For the case of a signal of the form 

s{t; Ao, (j)o, I) = An cos (0(t; ^) + M , (78) 



where (f>o is a constant phase, the maximum over (f>o in Equation (j77p can be obtained analytically. 
Moreover, assuming that over the bandwidth of the signal the spectral density of the noise is 
constant and that over the observation time cos(0(i;^)) oscillates rapidly, the fitting factor is 
approximately given by 



FF = max 

c 



^^cos(0(i;|)-(/.'(t;C)) dt] + U \i ((/.(i; - <^'(i; 0) dt 



1/2 

• (79) 
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In designing suboptimal filters one faces the issue of how small a fitting factor one can accept. 
A popular rule of thumb is accepting FF = 0.97. Assuming that the amplitude of the signal and 
consequently the signal-to-noise ratio decreases inversely proportional to the distance from the 
source this corresponds to 10% loss of the signals that would be detected by a matched filter. 

Proposals for good suboptimal (search) templates for the case of coalescing binaries are given 
in [26l [91] and for the case spinning neutron stars in [49l [16] . 

4.8 Algorithms to calculate the ^-statistic 

4.8.1 The tvifo-step procedure 

In order to detect signals we search for threshold crossings of the JF-statistic over the intrinsic 
parameter space. Once we have a threshold crossing we need to find the precise location of 
the maximum of T in order to estimate accurately the parameters of the signal. A satisfactory 
procedure is the two-step procedure. The first step is a coarse search where we evaluate T on a 
coarse grid in parameter space and locate threshold crossings. The second step, called fine search, 
is a refinement around the region of parameter space where the maximum identified by the coarse 
search is located. 

There are two methods to perform the fine search. One is to refine the grid around the threshold 
crossing found by the coarse search (TO] [68] [SI [11] , and the other is to use an optimization routine 
to find the maximum of T [49[ 160] . As initial value to the optimization routine we input the values 
of the parameters found by the coarse search. There are many maximization algorithms available. 
One useful method is the Nelder-Mead algorithm [61] which does not require computation of the 
derivatives of the function being maximized. 

4.8.2 Evaluation of the .F-statistic 

Usually the grid in parameter space is very large and it is important to calculate the optimum 
statistic as efficiently as possible. In special cases the J^-statistic given by Equation (|35p can be 
further simplified. For example, in the case of coalescing binaries J- can be expressed in terms of 
convolutions that depend on the difference between the time-of-arrival (TOA) of the signal and 
the TOA parameter of the filter. Such convolutions can be efficiently computed using Fast Fourier 
Transforms (FFTs). For continuous sources, like gravitational waves from rotating neutron stars 
observed by ground-based detectors [13] or gravitational waves form stellar mass binaries observed 
by space-borne detectors [60], the detection statistic T involves integrals of the general form 



where are the intrinsic parameters excluding the frequency parameter w, m is the amplitude 
modulation function, and w0mod the phase modulation function. The amplitude modulation func- 
tion is slowly varying comparing to the exponential terms in the integral ([50]) . We see that the 
integral (|80p can be interpreted as a Fourier transform (and computed efficiently with an FFT), 
if </>mod — and if m does not depend on the frequency u. In the long- wavelength approximation 
the amplitude function m does not depend on the frequency. In this case Equation ([80]) can be 
converted to a Fourier transform by introducing a new time variable t\j [87] . 



Thus in order to compute the integral (|80p . for each set of the intrinsic parameters we multiply 
the data by the amplitude modulation function m, resample according to Equation (j8ip . and 
perform the FFT. In the case of LISA detector data when the amplitude modulation m depends 




x{t) m{t; uj, exp I iw(/)i„od(i; S,^) ) exp{iujt) dt 



(80) 



th :=t + 0mod(i;f^). 



(81) 
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on frequency we can divide the data into several band-passed data sets, choosing the bandwidth 
for each set sufficiently small so that the change of mexp(ia;<^inod) is small over the band. In the 
integral (jSOp we can then use as the value of the frequency in the amplitude and phase modulation 
function the maximum frequency of the band of the signal (see [60) for details). 

4.8.3 Comparison with the Cramer Rao bound 

In order to test the performance of the maximization method of the T statistic it is useful to 
perform Monte Carlo simulations of the parameter estimation and compare the variances of the 
estimators with the variances calculated from the Fisher matrix. Such simulations were performed 
for various gravitational- wave signals [55l [191 US]- In these simulations we observe that above 
a certain signal-to-noise ratio, that we call the threshold signal-to-noise ratio, the results of the 
Monte Carlo simulations agree very well with the calculations of the rms errors from the inverse of 
the Fisher matrix. However, below the threshold signal-to-noise ratio they differ by a large factor. 
This threshold effect is well-known in signal processing [SS]. There exist more refined theoretical 
bounds on the rms errors that explain this effect, and they were also studied in the context 
of the gravitational- wave signal from a coalescing binary [72]. Use of the Fisher matrix in the 
assessment of the parameter estimators has been critically examined in [95] where a criterion has 
been established for signal-to- noise ratio above which the inverse of the Fisher matrix approximates 
well covariance of the estimators of the parameters. 

Here we present a simple model that explains the deviations from the covariance matrix and 
reproduces well the results of the Monte Carlo simulations. The model makes use of the concept 
of the elementary cell of the parameter space that we introduced in Section [4. 5. 2 1 The calculation 
given below is a generalization of the calculation of the rms error for the case of a monochromatic 
signal given by Rife and Boorstyn [81] . 

When the values of parameters of the template that correspond to the maximum of the func- 
tional T fall within the cell in the parameter space where the signal is present, the rms error is 
satisfactorily approximated by the inverse of the Fisher matrix. However, sometimes as a result 
of noise the global maximum is in the cell where there is no signal. We then say that an outlier 
has occurred. In the simplest case we can assume that the probability density of the values of the 
outliers is uniform over the search interval of a parameter, and then the rms error is given by 



where A is the length of the search interval for a given parameter. The probability that an outlier 
occurs will be the higher the lower the signal-to-noise ratio is. Let q be the probability that an 
outlier occurs. Then the total variance cP' of the estimator of a parameter is the weighted sum of 
the two errors 



where ucr is the rms errors calculated from the covariance matrix for a given parameter. One can 
show [49] that the probability q can be approximated by the following formula: 



where po and pi are the probability density functions of false alarm and detection given by Equa- 
tions (|46p and ([47]) . respectively, and where is the number of cells in the parameter space. 
Equation ([M]) is in good but not perfect agreement with the rms errors obtained from the Monte 
Carlo simulations (see [IH])- There are clearly also other reasons for deviations from the Cramer- 
Rao bound. One important effect (see [72]) is that the functional T has many local subsidiary 




(82) 



(83) 




(84) 
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maxima close to the global one. Thus for a low signal-to-noise the noise may promote the subsidiary 
maximum to a global one. 

4.9 Upper limits 

Detection of a signal is signified by a large value of the JF-statistic that is unlikely to arise from the 
noise-only distribution. If instead the value of T is consistent with pure noise with high probability 
we can place an upper limit on the strength of the signal. One way of doing this is to take the 
loudest event obtained in the search and solve the equation 

Pd(pul,^l) -/3 (85) 

for signal-to- noise ratio puL, where Pd is the detection probability given by Equation JFl is 
the value of the JF-statistic corresponding to the loudest event, and /? is a chosen confidence |15||T|. 
Then puL is the desired upper limit with confidence /?. 

When gravitational-wave data do not conform to a Gaussian probability density assumed in 
Equation ()49p . a more accurate upper limit can be obtained by injecting the signals into the 
detector's data and thereby estimating the probability of detection Pd ,3 . 

4.10 Network of detectors 

Several gravitational-wave detectors can observe gravitational waves from the same source. For 
example a network of bar detectors can observe a gravitational- wave burst from the same supernova 
explosion, or a network of laser interferometers can detect the inspiral of the same compact binary 
system. The space-borne LISA detector can be considered as a network of three detectors that 
can make three independent measurements of the same gravitational-wave signal. Simultaneous 
observations are also possible among different types of detectors, for example a search for supernova 
bursts can be performed simultaneously by bar and laser detectors [17] . 

We consider the general case of a network of detectors. Let h be the signal vector and let n be 
the noise vector of the network of detectors, i.e., the vector component /i^ is the response of the 
gravitational- wave signal in the fcth detector with noise rife. Let us also assume that each rik has 
zero mean. Assuming that the noise in all detectors is additive the data vector x can be written 
as 

x(t) n(t) +h(t). (86) 

In addition if the noise is a stationary, Gaussian, and continuous random process the log likelihood 
function is given by 

logA = (x|h)-i(h|h). (87) 
In Equation ([57)1 the scalar product ( • | • ) is defined by 

/•oo 

(x|y) 45R / i^S-^y* df, (88) 
Jo 

where S is the one-sided cross spectral density matrix of the noises of the detector network which 
is defined by (here E denotes the expectation value) 

E[n(/)n*T(/')] =ij(/_/')S(/). (89) 

The analysis is greatly simplified if the cross spectrum matrix S is diagonal. This means that the 
noises in various detectors are uncorrelated. This is the case when the detectors of the network are 
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in widely separated locations like for example the two LIGO detectors. However, this assumption 
is not always satisfied. An important case is the LISA detector where the noises of the three 
independent responses are correlated. Nevertheless for the case of LISA one can find a set of three 
combinations for which the noises are uncorrelated [781 180] . When the cross spectrum matrix is 
diagonal the optimum JF-statistic is just the sum of jF-statistics in each detector. 

A derivation of the likelihood function for an arbitrary network of detectors can be found in 39] , 
and applications of optimal filtering for the special cases of observations of coalescing binaries by 
networks of ground-based detectors are given in [151 1321 ITS] , for the case of stellar mass binaries 
observed by LISA space-borne detector in [60] , and for the case of spinning neutron stars observed 
by ground-based interferometers in [33]. The reduced Fisher matrix (Equationl43]) for the case of a 
network of interferometers observing spinning neutron stars has been derived and studied in [79] . 

A least square fit solution for the estimation of the sky location of a source of gravitational 
waves by a network of detectors for the case of a broad band burst was obtained in [15] . 

There is also another important method for analyzing the data from a network of detectors - 
the search for coincidences of events among detectors. This analysis is particularly important when 
we search for supernova bursts the waveforms of which are not very well known. Such signals can 
be easily mimicked by non-Gaussian behavior of the detector noise. The idea is to filter the data 
optimally in each of the detector and obtain candidate events. Then one compares parameters 
of candidate events, like for example times of arrivals of the bursts, among the detectors in the 
network. This method is widely used in the search for supernovae by networks of bar detectors [14] . 

4.11 Non-stationary, non-Gaussian, and non-linear data 

Equations (|34)) and ([35p provide maximum likelihood estimators only when the noise in which the 
signal is buried is Gaussian. There are general theorems in statistics indicating that the Gaussian 
noise is ubiquitous. One is the central limit theorem which states that the mean of any set of 
variates with any distribution having a finite mean and variance tends to the normal distribution. 
The other comes from the information theory and says that the probability distribution of a random 
variable with a given mean and variance which has the maximum entropy (minimum information) 
is the Gaussian distribution. Nevertheless, analysis of the data from gravitational-wave detectors 
shows that the noise in the detector may be non-Gaussian (see, e.g.. Figure 6 in [13 ). The noise 
in the detector may also be a non-linear and a non-stationary random process. 

The maximum likelihood method does not require that the noise in the detector is Gaussian 
or stationary. However, in order to derive the optimum statistic and calculate the Fisher matrix 
we need to know the statistical properties of the data. The probability distribution of the data 
may be complicated, and the derivation of the optimum statistic, the calculation of the Fisher 
matrix components and the false alarm probabilities may be impractical. There is however one 
important result that we have already mentioned. The matched-filter which is optimal for the 
Gaussian case is also a linear filter that gives maximum signal-to-noise ratio no matter what is the 
distribution of the data. Monte Carlo simulations performed by Finn [39] for the case of a network of 
detectors indicate that the performance of matched-filtering (i.e., the maximum likelihood method 
for Gaussian noise) is satisfactory for the case of non-Gaussian and stationary noise. 

Allen et al. [7] [5] derived an optimal (in the Neyman-Pearson sense, for weak signals) signal 
processing strategy when the detector noise is non-Gaussian and exhibits tail terms. This strategy is 
robust, meaning that it is close to optimal for Gaussian noise but far less sensitive than conventional 
methods to the excess large events that form the tail of the distribution. This strategy is based on 
an locally optimal test ([53]) that amounts to comparing a first non-zero derivative A„, 
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of the likelihood ratio with respect to the amplitude of the signal with a threshold instead of the 
likelihood ratio itself. 

In the remaining part of this section we review some statistical tests and methods to detect 
non-Gaussianity, non-stationarity, and non-linearity in the data. A classical test for a sequence 
of data to be Gaussian is the Kolmogorov-Smirnov test [2Tj . It calculates the maximum distance 
between the cumulative distribution of the data and that of a normal distribution, and assesses 
the significance of the distance. A similar test is the Lillifors test [27], but it adjusts for the fact 
that the parameters of the normal distribution are estimated from the data rather than specified 
in advance. Another test is the Jarque-Bera test [51] which determines whether sample skewness 
and kurtosis are unusually different from their Gaussian values. 

Let Xk and ui be two discrete in time random processes (— cxi < k,l < oo) and let ui be 
independent and identically distributed (i.i.d.). We call the process Xk linear if it can be represented 

by 

N 

Xk=^aiUk-i, (91) 

1=0 

where a; are constant coefficients. If ui is Gaussian (non-Gaussian), we say that xi is linear 
Gaussian (non-Gaussian). In order to test for linearity and Gaussianity we examine the third- 
order cumulants of the data. The third-order cumulant Cki of a zero mean stationary process is 
defined by 

Cm ■■= E [XjnXm+kXm+l] ■ (92) 

The bispectrum S'2(/i,/2) is the two-dimensional Fourier transform of Cki- The bicoherence is 
defined as 

where S{f) is the spectral density of the process Xk- If the process is Gaussian then its bispectrum 
and consequently its bicoherence is zero. One can easily show that if the process is linear then 
its bicoherence is constant. Thus if the bispectrum is not zero, then the process is non-Gaussian; 
if the bicoherence is not constant then the process is also non-linear. Consequently we have the 
following hypothesis testing problems: 

1 . Hi : The bispectrum of Xk is nonzero. 

2. Hq: The bispectrum of Xk is zero. 

If Hypothesis [Ijholds, we can test for linearity, that is, we have a second hypothesis testing problem: 

3. H']^: The bicoherence of Xk is not constant. 

4. H": The bicoherence of Xk is a constant. 

If Hypothesis m holds, the process is linear. 

Using the above tests we can detect non-Gaussianity and, if the process is non-Gaussian, non- 
linearity of the process. The distribution of the test statistic _B(/i,/2), Equation can be 
calculated in terms of distributions. For more details see ^5] . 

It is not difficult to examine non-stationarity of the data. One can divide the data into short 
segments and for each segment calculate the mean, standard deviation and estimate the spectrum. 
One can then investigate the variation of these quantities from one segment of the data to the other. 
This simple analysis can be useful in identifying and eliminating bad data. Another quantity to 
examine is the autocorrelation function of the data. For a stationary process the autocorrelation 
function should decay to zero. A test to detect certain non-stationarities used for analysis of 
econometric time series is the Dickey-Fuller test [25l. It models the data by an autoregressive 
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process and it tests whether values of the parameters of the process deviate from those allowed by 
a stationary model. A robust test for detection non-stationarity in data from gravitational-wave 
detectors has been developed by Mohanty p5 . The test involves applying Student's t-test to 
Fourier coefficients of segments of the data. 
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